Symmetry and Universality in Language Change

Richard A Blythe 


Abstract We investigate mechanisms for language change within a framework
where an unconventional signal for a meaning is first innovated, and then
subsequently propagated through a speech community to replace the existing convention.
We appeal to the notion of universality as it applies to complex interacting systems
in the physical sciences, which links generic (‘universal’) patterns at the
macroscopic scale to symmetries at the microscopic scale. By relating the presence
and absence of specific symmetries to fundamentally distinct mechanisms for language
change at the level of individual speakers and speech acts, we are able to draw
conclusions about which of these underlying mechanisms are most likely to be
responsible for the changes that actually occur. Since these mechanisms are typically
believed to be common to all speakers in all speech communities, this provides a
means to relate universals in individual behaviour to language universals.


1 Three Notions of Universality in Language Change 


Language is a system of behaviour that is acquired by social learning, that is, by
learning from other members of a social group as opposed to a process of individual
exploration [1]. On the face of it, the social interactions in which a linguistic
behaviour is transmitted from one individual to another are highly specific. Each
interaction could depend on the goals of the participants in the interaction, their
own individual history of usage, and the relative social standing of the individuals
involved, to name just three factors that have been discussed in the literature [2][3][4].
Nevertheless, when one looks at the system that arises from these repeated social
interactions, common patterns emerge.

Some of these patterns relate to the structure of language itself. For example,
typological surveys show that although six different orderings of the subject (S),
verb (V) and object (O) are possible, two particular orderings (SOV and SVO) are
much more common than any of the others (see Feature 81A in [?]). Other patterns
relate to how languages change over time, in particular those cases where one
conventional signal for a meaning is replaced by another [?]. A number of these
linguistic patterns are surveyed in [?]. These include the ‘male lag’, which relates
to the common observation that when a change is in progress, it is the females who
lead the change (i.e., are less likely to be users of the outgoing convention).
Meanwhile, when partitioning language users by age, rather than gender, one typically
encounters an ‘adolescent peak’, whereby the age group leading the change is not
the very youngest, but the adolescents. Finally, the frequency of the new convention
as a function of time tends to follow an S-curve: it starts slowly, then accelerates,
before tailing off as the old convention is eliminated. Indeed, this pattern is
seen not only in language change, but also in other types of cultural change, such
as the adoption of a technological innovation [?]. All of the phenomena described
in this paragraph might be described as universal, in the sense that they have been
observed in different social groups at different times, and in some cases even across
more than one type of cultural behaviour.

This, however, is not the only possible notion of universality that relates to
language change (or cultural evolution more generally). The linguistic behaviour that is
displayed and transmitted in social interactions is determined to some extent by the
cognitive and physical apparatus possessed by the interacting agents. For example,
in the case of word order, it is possible that sentences that have the subject first are
easier for humans to process than other types of sentence, which would be expected
to lead to those subject-first sentences being more common across the world’s
languages. A variety of such linguistic principles have been proposed; see e.g. [?] for a
discussion in a psychological context. Likewise, articulatory or auditory constraints
may cause certain vocalisations to be more easily produced or understood than
others [?]. The crucial point is that these constraints are assumed to be common to all
language users, no matter which social group they belong to; in this sense (and one
that is distinct from the above) these abilities are universal.

It is natural to expect some sort of link between these two types of universals;
that is, to propose that the origin of universal patterns of cultural evolution lies in the
universal constraints that underpin the social interactions and social learning. What
is unclear is whether the relationship is simple and transparent. If it is, every
phenomenon that is seen at the macroscale would be directly observable in individual
interactions. On the other hand, the relationship might be rather more complex,
arising from multiple biases and the fact that the behaviour has been acquired and
reproduced multiple times. Experimental work provides evidence in favour of both
positions (e.g., [?]), which is perhaps not surprising since they are not mutually
exclusive.

One tool that is becoming increasingly widely used to understand the link between
universals at the individual and population level is mathematical modelling
of complex interacting agent systems [?]. Here, a great deal of intuition
is drawn from the experience of modelling physical systems of interacting particles
(atoms and molecules) that collectively form macroscopic structured materials
(for example, metals). In this context, one encounters a notion of universality that
is again distinct from the two cited above. Roughly speaking, this notion pertains to
the link between individual-level and collective behaviour, and in this work we shall
draw inspiration from it to understand how the way in which a language change
propagates can be related to individual behaviour.

It is instructive to discuss briefly a concrete example of universality in condensed
matter physics to elucidate our approach. Magnetic materials are characterised by a
Curie temperature, below which they exhibit permanent magnetism [?]. For iron,
the Curie temperature is 770°C, which is why an iron bar serves as a good choice
for a bar magnet in a child’s chemistry set.* At a distance $\Delta T$ below the Curie
temperature, the strength of the magnet increases as a power law $(\Delta T)^{\beta}$ (at least in
the range where $\Delta T$ is small) [?]. It is this exponent $\beta$ that is universal: it has
the same numerical value for a wide variety of magnets with different microscopic
structures. For example, model magnets whose component parts interact with different
strengths or different ranges, or have different spatial arrangements, all have
the same power-law exponent $\beta$ [?].

We can now state more precisely what is meant by universality in this context. 
It applies when some macroscopic phenomenon is observed independently of the 
details of the interactions between the component parts as long as these interactions 
are consistent with a certain set of general principles. In condensed matter physics, 
these principles relate to the symmetry of the system [?]. In the example of the
magnet, the relevant symmetry property is that the interactions are unaffected if one 
exchanges all north and south poles of the microscopic magnets that collectively 
form the macroscopic magnet. 

In the remainder of this article, we will see how similar ideas relating to symmetry
in linguistic interactions between speakers can be used both to predict the
emergent dynamics of language change, and to categorise different theories for the
factors that may influence individual behaviour. As we will see, various types of
asymmetry are possible, and each corresponds to a characteristic pattern of language
change, only some of which are consistent with the universal S-curve of language
change mentioned above (and discussed in further detail below). While the main
results outlined here were established in the context of a specific model in Ref. [?],
we offer here a much broader perspective than was achieved in this earlier work. In
particular, we present some new general results that apply to a wide range of models
that respect the relevant symmetries while differing in detail. As such, they underline
the utility of considerations based on symmetry as a means to understand the
behaviour of complex interacting systems outside the physical sciences.


* One may ask what a magnet is doing in a ‘chemistry set’, given that magnetism is physics, but 
this is beyond the scope of this article.


2 Asymmetry in Language Change: The Universal S-Curve

We begin by stating more precisely the properties of the universal S-curve of
language change, and highlight why this points towards some underlying asymmetry
in the system. Throughout this work, we will always have in mind the case of a
language change in which the conventional signal for a specific meaning is replaced
over time with a new signal. Specific examples include the marking of the future
tense in Brazilian Portuguese [?], negation in French [20] and the word used by
English speakers in Canada to refer to the item of furniture that I (and the people
I typically interact with†) call a ‘sofa’ [?]. In each of these cases, the frequency
with which the incoming variant (‘couch’, in the last example) is used follows an
S-curve trajectory: the rate of growth initially accelerates until both incoming and
outgoing variants are widely used, after which the rate of growth decelerates as the
incoming variant becomes established as a convention (i.e., a variant that is used by
a large majority of speakers). The empirical data for Canadian furniture terms are
shown in Fig. 1, where the rise of ‘couch’ has been fit to an idealised logistic S-curve
that has been discussed widely in the historical and sociolinguistics literature
(see e.g. [21][22][?][23][?]).

Fig. 1 Variation in frequency of different furniture terms in Canadian English. Points connected
with dotted lines correspond to empirical data from [?]. This was an apparent-time study, meaning
that speakers of different ages were surveyed. In this framework, older speakers are assumed to be
representative of typical behaviour at an appropriate point in the past. Hence plotting the data as a
function of decreasing age gives an estimate of the real-time change trajectory for these data. Solid
lines are fits to the functions of Eq. (1); the thickest line is that with the largest growth rate, and is the
ultimately winning variant that follows the S-curve.


† An exception is my three-year old child, who has elected for ‘couch’.

The case of Canadian furniture terms is an interesting one as multiple variants
are in simultaneous competition. Among the older speakers, ‘sofa’ and ‘couch’ are
the most used low-frequency variants, and indeed given this information, one might
expect ‘sofa’ to be the term most likely to displace the existing convention
(‘chesterfield’). In the event, it is ‘couch’ that prevails. This shows that there are
(at least) two factors that determine the dynamics of one of the variants over time:
its initial frequency $f_i$, and its rate of growth, $s_i$. In the absence of competition,
the frequency grows as $x_i(t) = f_i e^{s_i t}$. The effect of competition is included by
ensuring that all the frequencies sum to 1: $\sum_i x_i(t) = 1$.

Then,

$$ x_i(t) = \frac{f_i \, e^{s_i t}}{\sum_j f_j \, e^{s_j t}} . \tag{1} $$


Depending on the initial frequencies and growth rates, one can arrive at a variety
of different shapes of curve, as shown in Figure 1. The key point is that the variant
with the largest growth rate ($s_i$) will eventually saturate to $x_i(t) = 1$ and, if it starts
at low frequency, will typically follow the characteristic S-shaped curve.
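
The competition between variants described above can be sketched numerically. The following is a minimal illustration, assuming exponential growth of each variant followed by normalisation so that the frequencies sum to 1; the initial frequencies and growth rates are hypothetical values chosen for illustration, not the fitted values behind Fig. 1.

```python
import math

def competing_frequencies(f0, s, t):
    """Normalised frequencies: each variant grows as f_i * exp(s_i * t),
    then all are divided by their sum so that they add up to 1."""
    weights = [f * math.exp(rate * t) for f, rate in zip(f0, s)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical values (not taken from the data): an established convention
# and two low-frequency challengers, the rarer of which grows fastest.
f0 = [0.90, 0.06, 0.04]   # initial frequencies, summing to 1
s = [0.0, 0.05, 0.15]     # growth rates

for t in range(0, 101, 20):
    x = competing_frequencies(f0, s, t)
    print(t, [round(v, 3) for v in x])
```

With these assumed numbers the variant with the largest growth rate starts lowest, rises slowly at first, and eventually saturates, while the intermediate variant rises and then falls back.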

Another way to understand the S-curve, and in particular its symmetry properties,
is to take a dynamical systems theory view. The rate of change of the variant
frequency when it is close to 0% or 100% is sufficiently small that it can be
idealised to zero. This implies that these are fixed points of the dynamics. However, the
initial state is an unstable fixed point (repulsive) while the final state is stable
(attractive). Thus there is an asymmetry in the stability of these two fixed points, which
in turn points towards some underlying asymmetry in the system of linguistically
interacting agents. As we will see in the following, there are a number of ways in
which this asymmetry may be generated: however, not all of them are equivalent in
terms of the language change trajectories that arise.
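
The stability asymmetry can be made concrete with a small numerical check. This sketch assumes the standard logistic flow dx/dt = s x (1 - x) as the idealised S-curve dynamics; the rate `s` is a hypothetical parameter. The sign of the slope of the flow at each fixed point determines whether small perturbations grow or decay.

```python
def flow(x, s=1.0):
    """Logistic flow dx/dt = s * x * (1 - x); s is a hypothetical growth rate."""
    return s * x * (1.0 - x)

def slope(x, s=1.0, h=1e-6):
    """Central-difference estimate of the derivative of the flow at x."""
    return (flow(x + h, s) - flow(x - h, s)) / (2.0 * h)

for x_star in (0.0, 1.0):
    lam = slope(x_star)
    # A positive slope means perturbations grow (unstable); negative means they decay.
    kind = "unstable (repulsive)" if lam > 0 else "stable (attractive)"
    print(f"x* = {x_star}: flow = {flow(x_star)}, slope = {lam:+.3f}, {kind}")
```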

We emphasise that our paradigm throughout this work is the case where an existing
convention is being replaced by an innovative variant signal for the same
meaning. The innovation process itself is not directly modelled; rather, it is implicit 
in the initial condition, which will be a very low (but nonzero) frequency for the 
innovation. 


3 Language Change with No Asymmetry 

For orientation, we ask the following question: What would language change look
like if there were no asymmetries at all? This is a very strong requirement. First, every
member of a speech community must behave identically. Every group of speakers
that interacts (be this in pairs, triads or larger units) must interact with the same
frequency, and each speaker must react in the same way to the behaviour of the
speakers they interact with. They must also give no preference to any of the variants
(e.g., different words for ‘sofa’) over any other that they are exposed to. This already
shows that there are at least three ways to generate asymmetry, and we shall consider
them all below.

Suppose the innovation (incoming variant) is used in the speech community with
a frequency $x$. Here, by frequency we mean a number between 0 and 1 which
corresponds to the fraction of utterances where a specific meaning is being expressed
in which the innovative signal is used. By implication, we have that the convention
has a frequency $1 - x$.

If no asymmetries are allowed, then all speakers can do is produce each variant 
in proportion to what they have heard. This means that the expected rate of change 
of any variant is zero: that is, the innovation frequency x is just as likely to go up as 
to go down in any time interval. 

We can demonstrate this result by appealing to a fairly general mathematical model
(and one that we will modify in later sections to explore the link between symmetry
and the resulting language change process). Let $G_{ij}$ be the probability that agents $i$
and $j$ interact in a time interval lasting $\delta t$, and let $x_j$ denote the usage frequency of
speaker $j$. The frequency of the innovation experienced by speaker $i$ over this time
interval is

$$ \tilde{x}_i = \frac{\sum_{j \neq i} G_{ij} x_j}{\sum_{j \neq i} G_{ij}} . \tag{2} $$

The quantity appearing in the denominator here, $G_i = \sum_{j \neq i} G_{ij}$, is the total
probability that agent $i$ interacts with another speaker in the time interval $\delta t$.

Now, if speaker $i$ participates in an interaction in this time interval (an event that
occurs with probability $G_i$), then we suppose that it updates its usage frequency to
equal some average of its existing value $x_i$ and the frequency $\tilde{x}_i$ observed in the
interaction. Otherwise, if it does not interact (probability $1 - G_i$), the usage frequency
remains unchanged. That is, the mean value of speaker $i$’s usage frequency after
an interaction, $x_i'$, is

$$ x_i' = G_i \left[ a x_i + (1 - a) \tilde{x}_i \right] + (1 - G_i) x_i = x_i + G_i (1 - a)(\tilde{x}_i - x_i) \tag{3} $$

where $a$ is a number between 0 and 1 that specifies how resistant a speaker is to
change. In the case $a = 0$, a speaker immediately accommodates to the usage
frequency of its interlocutors; in the case $a = 1$ it never changes. Since there are no
asymmetries, $a$ is the same for all speakers, and we avoid the pathological (and
uninteresting) case of no change, $a = 1$.

We can now work out what the overall frequency of the innovation is after all
speakers have updated their individual frequencies. We find

$$ x' = \frac{1}{N} \sum_i x_i' = \frac{1}{N} \sum_i \left[ x_i + G_i (1 - a)(\tilde{x}_i - x_i) \right] = x + \frac{1 - a}{N} \sum_i G_i (\tilde{x}_i - x_i) \tag{4} $$

where $N$ is the number of speakers. Notice that, strictly speaking, what we have
calculated here is the mean frequency of the innovation in the population, where the
average is over all possible interactions that might happen in the time interval $\delta t$.
For large speech communities, we are justified in ignoring fluctuations, which are
expected to be of order $1/\sqrt{N}$ (unless we find that the expected changes in $x$ are
themselves of similarly small magnitude).

Now, symmetry demands that $G_{ij} = G_{ji}$: the frequency with which $i$ interacts with $j$
equals the frequency with which $j$ interacts with $i$. (Actually, this will always be true,
although the response to the interaction need not be symmetric, as discussed below).
This has the following important consequence:

$$ \sum_i G_i \tilde{x}_i = \sum_i \sum_{j \neq i} G_{ij} x_j = \sum_j x_j \sum_{i \neq j} G_{ji} = \sum_j G_j x_j = \sum_i G_i x_i . \tag{5} $$

Using this in Eq. (4), we find

$$ x' = x + \frac{1 - a}{N} \left[ \sum_i G_i x_i - \sum_i G_i x_i \right] = x . \tag{6} $$

This shows that the expected frequency of the innovation in the speech community,
$x'$, at the end of a time interval lasting $\delta t$ is the same as its value, $x$, at the start of
that time interval. In other words, variant frequencies do not change on average.
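
The cancellation leading to Eq. (6) can be checked numerically. The sketch below follows Eqs. (2) to (4) literally for a randomly generated symmetric interaction matrix; the helper names, the community size and the way the matrix is drawn are assumptions made purely for illustration.

```python
import random

def expected_update(G, x, a):
    """Mean community frequency after one step, following Eqs. (2)-(4)."""
    n = len(x)
    new_x = []
    for i in range(n):
        # Total probability that speaker i interacts with anyone.
        G_i = sum(G[i][j] for j in range(n) if j != i)
        # Frequency experienced by speaker i, Eq. (2).
        heard = sum(G[i][j] * x[j] for j in range(n) if j != i) / G_i
        # Mean updated frequency of speaker i, Eq. (3).
        new_x.append(x[i] + G_i * (1 - a) * (heard - x[i]))
    return sum(new_x) / n  # community average, Eq. (4)

def random_symmetric_G(n, rng):
    """Hypothetical symmetric interaction matrix; entries below 1/n keep each G_i below 1."""
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            G[i][j] = G[j][i] = rng.uniform(0.0, 1.0 / n)
    return G

rng = random.Random(0)
n = 50
G = random_symmetric_G(n, rng)
x = [rng.random() for _ in range(n)]
print("before:", sum(x) / n)
print("after: ", expected_update(G, x, a=0.5))
```

The two printed values agree up to floating-point rounding, whatever the individual frequencies and however unevenly the interaction probabilities are distributed, as long as the matrix is symmetric.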

At this point it is necessary to return to the observation that there are fluctuations
of order $1/\sqrt{N}$ around this average change in frequency. That is, in any real system
we will expect to see small changes in variant frequencies from one time step to
the next due to fluctuations in the identities of the speakers who interact, and their
response to the interaction. However, the symmetries inherent in these interactions
imply that the probabilities of upward and downward fluctuations are
the same. Consequently, the small fluctuations in variant frequencies are undirected.
Given enough time, it is possible for one of the variants to be eliminated by chance,
at which point it will not (at least in the class of models under consideration here) be
reinvented. The canonical mathematical model for random processes with these
characteristics is genetic drift, which was introduced mathematically in the 1930s [25][26].
Typical trajectories of change generated by genetic drift are shown in Fig. 2 and
can be seen to differ significantly from the directed S-curve of Fig. 1. In particular,
both fixed points (at $x = 0$ and $x = 1$) are stable, as one would expect if there is no
underlying asymmetry.
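
Undirected fluctuations of this kind can be illustrated with a short simulation. The sketch below uses the Wright-Fisher form of genetic drift, which is an assumption: the text does not say which drift model generated Fig. 2, and the number of tokens, the number of generations and the seed are hypothetical.

```python
import random

def wright_fisher(x0, n_tokens, generations, rng):
    """Neutral resampling: each generation draws n_tokens copies,
    each being the innovation with probability equal to its current frequency."""
    x = x0
    path = [x]
    for _ in range(generations):
        count = sum(1 for _ in range(n_tokens) if rng.random() < x)
        x = count / n_tokens
        path.append(x)
    return path

rng = random.Random(2)
for run in range(4):
    path = wright_fisher(x0=0.05, n_tokens=100, generations=200, rng=rng)
    print(f"run {run}: final frequency {path[-1]:.2f}, peak {max(path):.2f}")
```

Once a trajectory reaches 0 or 1 it stays there, reflecting the fact that an eliminated variant is not reinvented in this class of models.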

For what follows, it is perhaps worth emphasising the symmetries assumed in this 
analysis. First, all possible variants are considered equivalent. Further, all speakers 
are equivalent in terms of their propensity to change (all have the same a value). 
Dyads are symmetric: when speaker i interacts with speaker j, speaker j interacts 
with speaker i (and they interact in the same way). More subtly, we assumed that a 
speaker’s updated usage frequency would be some linear combination of its existing
frequency and those of its interlocutors. As we will see below, nonlinear functions
correspond to distinguishing between variants by their usage frequencies. A fully
symmetric model would preclude giving a higher (or lower) weight to a more
frequent variant. We explore the effect of relaxing each of these symmetries in the
following sections.

A fully symmetric model would have all interaction frequencies $G_{ij}$ equal. Note,
however, that we did not make this assumption in the previous section. Therefore,
the main conclusion (that a variant’s frequency exhibits undirected fluctuations)
should hold for arbitrary variation in $G_{ij}$ between pairs of speakers. It turns out that
this is indeed the case, as was shown quite generally for a wide range of evolutionary
processes (including cultural evolution) on complex network structures [?]. In fact,
this insensitivity to network structure in the dynamics goes much deeper, in that the
fluctuations in the usage frequency at the community level depend only on the
size of the speech community, and not on the details of who speaks to whom and
how often [27][28]. This finding is reminiscent of the concept of universality as it
applies to magnets, where the spatial arrangement of atoms in the solid did not affect
its overall magnetic properties (see Section 1 above).
It would be somewhat unreasonable to expect every member of a speech community
to have the same number of interlocutors, and to interact with each of them
with exactly equal frequency. Consequently, the fully symmetric theory of the previous
section has never (to my knowledge) been advanced as a linguistic theory for
language change. However, the extension to the case where interaction frequencies
can vary and speakers adopt some average of their own and their interlocutors’
frequencies for future interactions has been advanced as the linguistic theory of
determinism, at least as it applied to new-dialect formation under certain
circumstances [?]. The psychological basis for this theory is accommodation, a process
whereby speakers align themselves with their interlocutors for various reasons, for
example, to increase their chance of being understood [?]. If one looks infinitely
far into the future, the outcome of a genetic drift process is quantitatively consistent
with the fate of New Zealand English [?]. However, this explanation relies on
fluctuations to reach the state where all agents have adopted the innovation. Recall
from the previous section that the magnitude of these fluctuations decreases with
the speech community size. The work of [?] concludes that the timescale of
change in such a theory increases with the speech community size in a way that is
inconsistent with the rapid pace of language change seen in the example of New
Zealand English, where the speech community was large. In order to see a more
rapid change, or to see a directed change, a more powerful asymmetry is needed.

Fig. 2 Four change trajectories generated by genetic drift with an initial frequency of the
innovation of 0.05. In each case, strong upward and downward fluctuations are observed, and no directed
S-curve change trajectory (similar to that seen in Fig. 1) is seen. Three of the innovations go extinct
(one rather quickly); one is still present at the end of the time period shown.

5 Asymmetry in Social Attitudes 

One way to introduce further asymmetry is if speakers have different attitudes
towards each others’ behaviour. In particular, in an interaction between speakers $i$ and
$j$, there is no particular reason why speaker $i$ should give the same weight to speaker
$j$’s utterances as the other way round. The question now is whether this asymmetry
can generate the sustained directed growth of an innovation.

For this to be possible, there must be some correlation between this asymmetry
and the set of speakers who initially use the innovation. To understand why,
consider the opposite case where there are no correlations between the influence that a
speaker has and whether they initially use the innovation. The average influence of
speakers who use the innovation is then, by definition, equal to the average influence
of speakers who do not. This averaging out of influence then restores the symmetry
between the variants, and thus one would not expect directed changes to arise.

In models of innovation diffusion, some relationship between innovativeness and
social influence is typically assumed (albeit with varying degrees of explicitness).
For example, Rogers [?] refers to a group of ‘innovators’ who have influence over
an ‘early majority’ who, in turn, have influence over a ‘late majority’ and so on. In
the sociolinguistics literature, there has been some discussion of social networks,
focussing on the role that strong ties between individuals might play as a mechanism
to preserve social norms, and on the role of the number and quality of relationships
between individuals in governing how linguistic variation propagates (see e.g. [?]).
Meanwhile, Labov [?] and Rogers [?] further emphasise the important role played by
specific individuals who have influence over other members of a social group when it
comes to propagating an innovation. These all imply some sort of asymmetry in social
influence.

What we have found when incorporating social asymmetry into a model of
language change is that an innovation which is initially used within a small group of
influential users can grow in frequency over a sustained period. However, the shape
of the adoption curve depends somewhat on the details of how the social network is
configured [?]. This one can see by asking how the average usage frequency of the
innovation changes in the presence of social asymmetry.

To this end, let us return to the mathematical model of Section 3 and generalise it to
the case of asymmetric interactions. This we shall do by redefining the quantity $G_{ij}$.
Previously, this was the probability that agents $i$ and $j$ interact in a time interval of
length $\delta t$. We now take it to be equal to the probability that agents $i$ and $j$ interact
in this time interval and that agent $i$ modifies its usage frequency in response to
agent $j$’s utterances. Clearly we no longer require $G_{ij} = G_{ji}$. If agent $i$ seeks to
emulate agent $j$ more than the other way round, we will have $G_{ij} > G_{ji}$; otherwise
the converse will be true.

In terms of the mathematics, Eqs. (2) to (4) are unaffected by this redefinition.
However, the relationship (5) crucially depended on the symmetry $G_{ij} = G_{ji}$. This
time, we find instead from (4) that

$$ x' - x = \frac{1 - a}{N} \cdot \frac{1}{2} \sum_i \sum_{j \neq i} (G_{ij} - G_{ji})(x_j - x_i) \tag{7} $$

where we have twice used the fact that $\sum_i \sum_{j \neq i} f_{ij} = \sum_i \sum_{j \neq i} f_{ji}$ for any
summand $f_{ij}$, by exchanging indices and reversing the order of summation.

This expression shows that the interaction asymmetry $G_{ij} - G_{ji}$ is crucial in
determining the rate of change of the usage frequency $x$. First of all, we can confirm
our intuition that where the language behaviour (encoded here by the differences
$x_j - x_i$) is uncorrelated with the interaction asymmetries, the above sum will be of
order $1/\sqrt{N}$, and consequently we expect the dynamics to be similar to the case of
no asymmetry (see Section 3).

Second, when $G_{ij} - G_{ji}$ is positive, we see that the usage frequency tends to
increase if speaker $j$ is more innovative than speaker $i$ (i.e., if $x_j > x_i$), and it tends to
decrease otherwise. This is to be expected, since we have $G_{ij} > G_{ji}$ when speaker
$i$ pays more attention to speaker $j$ than vice versa. More significantly, this
observation has implications for the shape of an adoption curve when the frequency of an
innovation is small.

To see this, suppose initially that some speakers are innovators, and have $x_i = 1$,
whilst the remainder of the speech community are all categorical users of the existing
convention, and have $x_i = 0$. Suppose also that the innovators exert influence over
non-innovators. Then, for this initial condition we have

$$ x' - x = \frac{1 - a}{N} \sum_{[ij]} (G_{ij} - G_{ji}) \tag{8} $$

where here the notation $[ij]$ refers to ordered pairs $i, j$ such that speaker $j$ is an
innovator and speaker $i$ is not. The statement that the innovators exert influence
over non-innovators implies that $G_{ij} > G_{ji}$ for all such pairs. Hence, $x' - x$ is a
strictly positive quantity even with small numbers of innovators: that is, there is
some positive rate of growth at low innovation frequencies. Simulation results [?]
show that this initial growth can be sufficiently rapid to be inconsistent with an
S-shaped adoption curve. In particular, it was shown that the initial period of slow
growth could only be realised under conditions where the size of each successive
group in the chain of adopters increased exponentially along the chain [?]. As far
as we are aware, this does not match any known population structure.
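
The positive initial growth rate under social asymmetry can be verified on a toy example. In the sketch below the mean change is computed in two ways: speaker by speaker from Eqs. (2) to (4), and from the antisymmetrised sum of Eq. (7). The community size, the set of innovators and the three interaction weights are hypothetical values chosen only so that innovators are attended to more than they attend to others.

```python
def direct_change(G, x, a):
    """x' - x computed speaker by speaker from Eqs. (2)-(4)."""
    n = len(x)
    total = 0.0
    for i in range(n):
        G_i = sum(G[i][j] for j in range(n) if j != i)
        heard = sum(G[i][j] * x[j] for j in range(n) if j != i) / G_i
        total += G_i * (1 - a) * (heard - x[i])
    return total / n

def antisymmetric_form(G, x, a):
    """x' - x from Eq. (7): half the double sum of (G_ij - G_ji)(x_j - x_i)."""
    n = len(x)
    s = sum((G[i][j] - G[j][i]) * (x[j] - x[i])
            for i in range(n) for j in range(n) if j != i)
    return (1 - a) / n * 0.5 * s

def influence_matrix(n, innovators, strong, weak, base):
    """G[i][j] is the chance that i interacts with j AND adjusts towards j."""
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if j in innovators and i not in innovators:
                G[i][j] = strong   # non-innovator i attends to innovator j
            elif i in innovators and j not in innovators:
                G[i][j] = weak     # innovator i attends little to non-innovator j
            else:
                G[i][j] = base
    return G

n, a = 20, 0.5
innovators = {0, 1, 2}                       # hypothetical small innovating group
G = influence_matrix(n, innovators, strong=0.04, weak=0.01, base=0.02)
x = [1.0 if i in innovators else 0.0 for i in range(n)]

print("direct:  ", direct_change(G, x, a))
print("Eq. (7): ", antisymmetric_form(G, x, a))
```

Both routes give the same strictly positive number, in line with Eq. (8): here there are 3 × 17 innovator and non-innovator pairs, each contributing the weight difference 0.03.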

In summary, when one relies on asymmetry in social attitudes to drive the adoption
of an innovation, universality in the third (physics) sense does not apply. The
details of the network of social influences matter, at least in terms of the initial shape 
of the adoption curve. Therefore, this type of asymmetry does not provide a robust 
explanation for the universal S-shaped trajectory of language change. 


6 Asymmetry in the Variants 

It turns out that a robust explanation for the S-curve is provided by an asymmetry
in speakers’ attitudes towards the linguistic behaviour itself rather than its users. To
model this, we now introduce an explicit bias $f(x_j)$ into agent $i$’s estimate of the
innovation’s usage frequency among its interlocutors. Specifically, we now take

$$ \tilde{x}_i = \frac{\sum_{j \neq i} G_{ij} \left[ x_j + f(x_j) \right]}{\sum_{j \neq i} G_{ij}} \tag{9} $$

instead of the unbiased expression (2). Whenever the bias $f(x_j)$ is positive, the
frequency of the innovation is over-estimated relative to its actual value; likewise when
it is negative, the frequency is under-estimated. We impose two constraints on the
form of $f(x_j)$. First, we insist that it vanishes when $x_j = 0$ or when $x_j = 1$. This is
to be consistent with our approach, in which the innovation process is implicit in
the initial condition: if we did not have $f(0) = f(1) = 0$, the innovation would be
spontaneously recreated if it goes extinct. We also insist that $0 \le x_j + f(x_j) \le 1$ for
all $x_j$, so that it can be interpreted as a frequency in the same way as $x_j$.

If we take this variant-based asymmetry to be the sole asymmetry in the system,
we will have $G_{ij} = G_{ji}$, as in Section 3. Using the above expression for $\tilde{x}_i$ in Eq. (4),
we find that

$$ x' - x = \frac{1 - a}{N} \sum_i G_i f(x_i) . \tag{10} $$

We can now perform the same experiment as in the previous section, where we
assign $x_i = 1$ to a group of innovators, and $x_i = 0$ to a group of conformists and ask for
the initial shape of the change trajectory. Since $f(x_i) = 0$ in both cases, we find that
$x' = x$, showing that with this initial condition, the frequency of the innovation can
change only through a fluctuation. This is, however, not the same as the
fluctuation-driven dynamics that arises when the dynamics are fully symmetric (as described
in Section 3). There, $x' = x$ no matter what the individual usage frequencies $x_i$ are.
Here, $x' = x$ only if all usage is categorical: as soon as some individuals show variable
behaviour (for example, as they start to adopt the innovation), we will have
$x' > x$ in the case where the bias acts in favour of the innovation (i.e., if $f(x) > 0$).

It is worth emphasising the crucial difference between the initial growth of an
innovation arising from speaker-based asymmetry (Section 5) and the variant-based
asymmetry just described. For speaker-based asymmetry, we found that with a small
number of categorical innovators, the mean growth rate of the innovation is nonzero,
which corresponds to a rapid initial growth. Here, for variant-based asymmetry,
we found under the same conditions that the mean growth rate of the innovation
vanishes, and so a slower initial growth arises. These expectations were confirmed
with an explicit model in [?], which led us to hypothesise that this variant-based
asymmetry is a crucial component of language change in real speech communities.
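
The vanishing initial growth rate under variant-based asymmetry can also be checked on a toy example. The sketch below assumes a uniform symmetric interaction matrix and a hypothetical bias f(x) = b x (1 - x), chosen only because it vanishes at 0 and 1 and keeps x + f(x) between 0 and 1 for 0 ≤ b ≤ 1; it is not the bias used in the earlier work.

```python
def biased_change(G, x, a, f):
    """x' - x when the heard frequency carries a bias f, Eq. (9) inserted into Eq. (4)."""
    n = len(x)
    total = 0.0
    for i in range(n):
        G_i = sum(G[i][j] for j in range(n) if j != i)
        heard = sum(G[i][j] * (x[j] + f(x[j])) for j in range(n) if j != i) / G_i
        total += G_i * (1 - a) * (heard - x[i])
    return total / n

def eq10_prediction(G, x, a, f):
    """Right-hand side of Eq. (10), valid when G is symmetric."""
    n = len(x)
    return (1 - a) / n * sum(
        sum(G[i][j] for j in range(n) if j != i) * f(x[i]) for i in range(n)
    )

def bias(x, b=0.5):
    """Hypothetical pro-innovation bias; zero for categorical usage."""
    return b * x * (1.0 - x)

n, a = 10, 0.5
G = [[0.0 if i == j else 0.05 for j in range(n)] for i in range(n)]  # symmetric

categorical = [1.0, 1.0] + [0.0] * 8   # two categorical innovators
variable = [0.9, 0.9] + [0.1] * 8      # everyone shows some variation

print("categorical:", biased_change(G, categorical, a, bias))
print("variable:   ", biased_change(G, variable, a, bias),
      "Eq. (10):", eq10_prediction(G, variable, a, bias))
```

The categorical initial condition gives zero mean change (up to rounding), whereas any variable usage gives a positive mean change that matches Eq. (10).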


7 Asymmetry in Variant Frequencies 

The foregoing does not cover all possible asymmetries that might exist in linguistic 
behaviour. In particular, one way in which variants could be discriminated is through 
their frequencies alone, without reference to any aspect of the behaviour itself or 
association with its users. This is actually a specific type of variant asymmetry, and 
as such can be modelled through an appropriate choice of the function f(x) that was 
introduced in the previous section. 

Suppose there are just two variants in competition with each other. Although 
we will allow the bias f(x) to vary with frequency, we will do so in a way that is 
symmetric with respect to the variants: that is, the boost applied to a variant with 
some specific frequency x_0 is the same, regardless of which variant this is. In the 
two-variant case, the two frequencies are x and 1 - x. The symmetry between them 
implies that the function f(x) must satisfy the constraint 

f(1 - x) = -f(x). (11) 

Again, we will require that f(0) = f(1) = 0, so that any innovation is innovated only 
once, and all subsequent adoption of the innovation arises from social interactions. 
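
One consequence of the constraint (11) can be spelled out as a short worked step: 
evaluating it at equal frequencies, x = 1/2, gives 

```latex
f\!\left(\tfrac{1}{2}\right) = -f\!\left(\tfrac{1}{2}\right)
\quad\Longrightarrow\quad
f\!\left(\tfrac{1}{2}\right) = 0 .
```

More generally, whenever f(x) > 0 the constraint forces f(1 - x) < 0. A bias that 
depends on frequency alone therefore cannot favour the same variant at every 
frequency. 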

The simplest functions that satisfy these requirements are f(x) = 0 and f(x) = 
x(1 - x)(2x - 1). The latter corresponds to boosting a variant if it is in the 
majority, and suppressing it if it is in the minority. This type of frequency boosting, 
or regularisation, has been observed in a variety of frequency learning experiments, both 
in the linguistic and non-linguistic domain Elin]. A difficulty with this type of 
model is that there is a threshold problem: low frequency variants face an uphill 
struggle to reach a frequency of 50%, which is needed for the regularisation bias to 
act in their favour. The presence of noise complicates matters. If the magnitude of 
any fluctuations is small, this reasoning (based primarily on deterministic 
considerations) continues to hold. However, when fluctuations are large there is in 
fact a transition into a regime where the regularisation bias is suppressed, and the 
usage frequencies fluctuate in the same way as in the fully symmetric case described 
in an earlier section. 
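
The threshold problem in the low-noise regime can be illustrated with a short 
sketch. It uses the regularisation bias f(x) = x(1 - x)(2x - 1) from the text, but 
the deterministic update x -> x + s f(x), the step size s and the number of steps 
are assumptions made purely for illustration. Noise is ignored entirely. 

```python
def f(x):
    """Regularisation bias from the text: boosts whichever variant is in the majority."""
    return x * (1.0 - x) * (2.0 * x - 1.0)


def iterate(x0, s=0.5, steps=200):
    """Assumed deterministic update x -> x + s * f(x); fluctuations are ignored."""
    x = x0
    for _ in range(steps):
        x = x + s * f(x)
    return x


# Check the antisymmetry f(1 - x) = -f(x) on a grid of frequencies.
grid = [k / 20 for k in range(21)]
antisymmetric = all(abs(f(1.0 - x) + f(x)) < 1e-12 for x in grid)

below = iterate(0.4)  # innovation starts in the minority
above = iterate(0.6)  # innovation starts in the majority
print(antisymmetric, below, above)
```

In this sketch a variant that starts below 50% is driven towards extinction and one 
that starts above 50% is driven towards fixation, so an innovation that begins at 
low frequency cannot spread by this bias alone. 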

This suggests that the main mechanism for propagating an innovation along an 
S-shaped adoption curve is for the function f(x) to be positive at all intermediate 
frequencies 0 < x < 1, thereby providing a systematic bias in favour of the 
innovation. However, this raises the question of where this bias comes from. In mi, 
it is suggested that speaker-based asymmetry provides a means for speakers to create 
an association between a group of speakers and a particular linguistic behaviour. 
Once this happens, a variant-based asymmetry whose origins lie in speaker-based 
asymmetry may arise. Whether this scenario can be realised spontaneously through 
local interactions in an agent-based model is the subject of a current investigation [36]. 
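
To see why a bias that is positive at all intermediate frequencies produces an 
S-shaped curve, consider the following sketch. The deterministic update 
x -> x + s x (1 - x), the value of s and the starting frequency are hypothetical 
choices made for illustration. The text itself only requires f(x) > 0 between the 
end points, with f(0) = f(1) = 0. 

```python
def trajectory(x0=0.01, s=0.2, steps=80):
    """Frequencies under the assumed update x -> x + s * x * (1 - x)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + s * x * (1.0 - x))
    return xs


xs = trajectory()
# Per-step growth of the innovation's frequency.
increments = [b - a for a, b in zip(xs, xs[1:])]
peak = increments.index(max(increments))

# Initial frequency, frequency at the point of fastest growth, final frequency.
print(xs[0], xs[peak], xs[-1])
```

Growth is slow at first because few speakers use the innovation, fastest near 
x = 1/2, and slow again as the old convention dies out, which is the qualitative 
S-shape referred to above. 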

Another possibility, raised in llJTl . is to distinguish between variants not in terms 
of their current usage frequencies, but according to whether they are increasing or 
decreasing. A ‘momentum-based’ bias lIMl towards further increasing the frequency 
of a variant which has been increasing in the past could in principle propagate an 
innovation without appealing to a bias that is based on speaker identity. Again, the 
question of whether this can arise purely through local interactions between speakers 
is being investigated with reference to an agent-based model 13^ . 


8 Discussion 

In this short article, we have explored the various notions of universality in language 
change. Drawing inspiration from the relationship between symmetry and universality 
in physics, we have appealed to symmetry as a means to categorise theories for 
language change. Specifically, we identified the following sources of asymmetry in 
models of language change: variation in interaction frequencies alone (which 
corresponds to the theories of accommodation and determinism); asymmetry in the 
degree of influence that speakers have over each other (which corresponds to theories 
based on social network effects, propounded for example by Bloomfield iHOl . Labov 
a, Milroy ll^ and others); variation in the attitude towards different linguistic 
variants (which correspond to theories based on prestige and related social factors, 
advanced for example by Sturtevant ED and Labov a, which enjoy some prominence 
among sociolinguists); and finally asymmetry that is based on the usage frequencies 
of variants (such as regularisation effects [34] and momentum-based explanations 
for change IHl). 

The key message is that only some of these distinct sources of asymmetry are 
compatible with the widely-observed (‘universal’) S-shaped curve for the adoption 
of an innovation. We found that a robust model that generates the S-shaped curve can 
be achieved with a prestige-based explanation (i.e., different attitudes to particular 
ways of speaking) or potentially with a momentum-based explanation [37][^ . In 
this work, this was determined primarily by investigating the initial rate of growth 
of an innovation within a fairly general mathematical framework, complementing 
existing studies that were based on specific simulation models. 

A crucial open question that remains is the following. Appealing to symmetries 
is useful as it allows broad classes of explanations for language change to be 
excluded with reference only to qualitative features of empirical data. However, this 
is insufficient to identify a single theory for language change: more than one is 
compatible with the qualitative data. The challenge then is to distinguish between 
these remaining theories. One particular issue with a social prestige type 
explanation is how the bias towards one linguistic variant over another becomes embedded 
in the speech community. In ifTTll it was found that a majority of speakers should be 
positively disposed towards the innovation: how does this positive disposition itself 
spread through the speech community? The momentum-based theory of llJTl [W1 
potentially side-steps this issue, since the variants are distinguished by their usage 
history. If different members of the speech community agree that an innovation is 
becoming more prevalent, they will all boost the frequency of the same variant. 
Whilst this is perhaps a more parsimonious theory, that is not in itself sufficient 
to conclude that it is the more appropriate one. Instead, some independent 
empirical evidence in favour of a specific explanation is needed. Even better would be 
to demonstrate that the favoured theory shows greater quantitative agreement with 
empirical data at both the individual and population level. The complexity of human 
behaviour and social interactions is such that this will be a challenging task, but one 
where sustained research effort would certainly be worthwhile. 